Feat/solo core metatheory proofs#614
Merged
Merged
Conversation
Two minimal edits, no change to the proofs' content:
- drop a redundant `simp` ("no goals to be solved") in the eval_recv
case of tropical_session_soundness;
- insert `simp only [TropicalBudget] at *` before the three `omega`
calls in budget_strictly_bounds_billing, so omega sees through the
`abbrev TropicalBudget := Nat` (4.13.0 no longer unfolds the abbrev
for omega, which otherwise reports "No usable constraints found").
No sorry/admit/axiom/native_decide. `lean TropicalSessionTypes.lean`
now exits 0.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Completes the call-by-value reduction relation and the progress half of Solo-core soundness; preservation remains (?todo_preservation). - Subst.idr (new): de Bruijn shift + single-variable substitution (subst0), structurally total, no assert_total. - Soundness.idr: Step now has its CBV constructors (beta/let/case/ projection redexes + evaluation-context congruences, per spec §4.2-4.3); progress proven by induction on the typing derivation with canonical-forms reasoning inlined; StepsTo's reduct is erased. - ContextLemmas.idr (new): IsZero predicate + Empty/scale inversions. - Typing.idr: encoding-only reformulation needed to *prove* (not just *state*) the metatheory in Idris2 — split contexts of T-App/Pair/ Case/Let are explicit (Idris2 erases indices, so they must be runtime to be analysed) and the all-zero context of T-Var/T-Unit uses the analysable IsZero premise instead of the non-invertible computed index ctxZero g. The typing *relation* is unchanged. No believe_me/assert_total/postulate; idris2 --check Soundness.idr exits 0; progress is total and hole-free. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Backs the `just proof-check-{idris2,lean,agda,all}` recipes (referenced
by the PROOF-STATUS templates but never implemented) with
tools/check-proofs.sh. Each stage re-runs the proof against its
assistant and fails on a type error or any dangerous escape hatch
(believe_me / assert_total / postulate / sorry / axiom / native_decide
/ idris_crash); unfinished Idris2 `?` holes are reported as a warning.
Coverage: Solo-core QTT metatheory (Idris2), tropical session types
(Lean 4), and the 23 echo-boundary certificates (Agda). The Agda stage
needs AFFINESCRIPT_ECHO_TYPES_DIR + AGDA_STDLIB (the certificates import
the external echo-types proofs + agda-stdlib) and skips with a clear
message when either is unset, so the harness still runs hermetically.
Verified: idris2/lean stages PASS; agda stage PASSES all 23 proofs with
the env vars set and SKIPs cleanly without them.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…lete) Completes the Solo-core soundness metatheory. progress and preservation are fully mechanised: idris2 --check Soundness.idr exits 0, every definition is total, and there is NO believe_me / assert_total / really_believe_me / postulate / ? hole. What is proved: * progress : Has Empty t a -> Either (Value t) (StepsTo t) * preservation: Has g t a -> Step t t' -> (g' ** (Weaker g' g, Has g' t' a)) * affinePreservation: same. The substitution-lemma infrastructure (ContextLemmas.idr AddCtx algebra, SubstLemma.idr substGen/substLemma0, Subst.idr de Bruijn shift/subst) was produced by a parallel proof search and re-verified locally. Mechanisation finding + owner decision (2026-06-15): The original statement asked for the reduct in the SAME context. That is FALSE for the Solo core as designed -- TPair introduces multiplicatively (split context) but is eliminated by Fst/Snd projections, so e.g. "Snd (Pair x y) steps to y" (x linear) drops x's resources, leaving a strictly smaller context. Every beta-rule preserves the context exactly (via the substitution lemma); only the projections shrink it. Resolved the AFFINE way (AffineScript is affine: lib/quantity.ml's q_le permits under-use): the language is unchanged (split product + projections + eager CBV kept) and preservation is restated affinely (reduct in a Weaker sub-context, g' <= g pointwise, same shape/types). See dev-notes/2026-06-15-affinescript-solo-core-preservation/. Typing.idr encoding note: split contexts of T-App/Pair/Case/Let are explicit AddCtx premises and T-Var/T-Unit use an IsZero premise -- an encoding change (Idris2 erases data indices) that leaves the set of derivable judgements unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… gates Two additive gates that pin compile-target validity, mirroring the check-proofs.sh pattern: - just wasm-validate (tools/wasm-validate-gate.sh): compiles the positive corpus to core wasm and runs wasm-tools validate on every module (19 pass / 0 fail / 5 allowlisted carve-outs). Closes the test_e2e.ml:601 blind spot where tests only checked codegen-did-not-raise, never byte validity. - just coprocessor-validate (tools/coprocessor-gate.sh): compiles canonical kernels to all nine Tier-C accelerator backends (WGSL/SPIR-V/CUDA/Metal/OpenCL/MLIR/ONNX/Faust/Verilog) plus LLVM, validating WGSL via naga and SPIR-V via magic word (12 pass / 0 fail). Also surfaces issues-drafts/05: the core-wasm uniform 4-byte heap-cell model mismodels any Float that transits the heap (invalid module on load, silent 32-of-64-bit truncation on store) — newly caught by the validate gate, tracked for a type-directed-layout fix. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… (ADR-0024) Commences the native Android target (owner decision 2026-06-16: ARM64-native, not JVM/Kotlin — the repo bans Kotlin/Swift for mobile). lib/llvm_codegen.ml: the LLVM target triple is now a parameter read from AFFINESCRIPT_LLVM_TRIPLE (default x86_64-unknown-linux-gnu — zero behaviour change). The emitted IR is target-independent: the same .ll lowers to both x86-64 and aarch64 objects. Verified spine: AffineScript -> LLVM IR (aarch64-linux-android) -> llc -> real ELF AArch64 object with genuine ARM64 instructions, no llc triple override needed. Pinned by just android-validate (tools/android-aarch64-gate.sh, 2 pass / 0 fail). Staged downstream (NDK-gated, documented in ADR-0024): --target CLI flag, NDK link to JNI .so, aarch64 runtime, Gossamer APK packaging. Full suite + guard green after the change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Verifies the AffineScript <-> typed-wasm contract end-to-end across repositories: AffineScript (OCaml) emits the typedwasm.ownership v1 carrier; the sibling hyperpolymath/typed-wasm Rust tw-verify consumes it. Bit-exact agreement proven both ways: clean linear (own consumed once) -> both VERIFIED (exit 0); dropped-own -> both emit the identical L10 violation (exit 1). Pinned by just typed-wasm-validate (tools/typed-wasm-roundtrip-gate.sh, 2 pass / 0 fail). Surfaces issues-drafts/06: the ownership carrier has no Affine kind, so AffineScript's affine 'own' is mapped to Linear (L10 exactly-once) and over-rejected on legal drops — the affine-vs-linear gap at the typed-wasm boundary. Fix is a coordinated v2 carrier change (AffineScript + ephapax + typed-wasm). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Designs the VM as the semantics engine (per the locked targeting strategy: wasm=portability, native=performance, VM=semantics). Bytecode VM with reified continuations for multi-shot effects/resume (#555), a handler stack (exceptions = single-shot degenerate case), runtime affine/linear enforcement, tropical cost-metering, and a proof-oracle mode. 5 independently-gated phases (M1 interp-parity .. M5 proof oracle). 4 open questions raised for owner sign-off before implementation. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The core-wasm heap is a uniform 4-byte i32-cell model, so any Float that transits a heap cell (array element, tuple element, record field) was mismodeled: invalid wasm on load, silent 32-of-64-bit truncation on store. The interim secure fix raises Codegen.UnsupportedFeature instead of emitting corrupt bytes — matching the #555/#556 loud-fail policy and routing to -i / -julia / a GPU backend (all of which lower f64 aggregates correctly). Guards (lib/codegen.ml): guard_fn_no_heap_float on every function param + return type; guard_no_float_elems on Array/tuple/record literals. Scalar Float (stays in f64 locals) and Int aggregates are untouched. Verified: float-array load/store + float-tuple-proj now loud-fail; scalar-float + int-array + int-tuple still emit valid wasm; full suite + wasm-validate + coprocessor gates all green (GPU f64 kernels unaffected — guard is core-wasm only). just wasm-validate now pins the loud-fail (2 rej cases). The durable type-directed heap-layout fix remains tracked as task #8 / issue-draft 05 option (1). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Owner sign-off 2026-06-16. Decisions recorded: (1) substrate = CESK abstract machine (defunctionalised continuations over the AST), expressly the step toward a flat bytecode VM (the stated destination it defunctionalises into, semantics-preserving, Reynolds/Danvy); (2) VM-native tagged values (QTT kind on the tag); (3) M1 = reduced Solo core paired with the proofs, not interp parity (interp.ml keeps serving real programs); (4) runtime affine/cost enforcement on by default, --unchecked escape. Phasing updated: M1 Solo-core CESK + proof oracle from M1, M2 multi-shot effects/resume (#555), M3 exceptions, M4 harden enforcement (on by default), M5 Duet oracle, (direction) defunctionalise to bytecode VM. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…triple) A RISC-V collaboration wants native AffineScript. Because ADR-0024 parameterised the LLVM triple, riscv64 is already served at the codegen spine with zero code change: AFFINESCRIPT_LLVM_TRIPLE=riscv64-unknown-linux-gnu -> real ELF RV64 object (add a0,a0,a1 / ret). just android-validate now pins both aarch64 AND riscv64, proving the native spine is multi-target, not Android-specific. The remaining RISC-V work is the riscv64-linux runtime + link (a standard cross-toolchain — simpler than Android's NDK/APK). Recorded in ADR-0024 Consequences: a generic spine makes a new target customer a one-line change, not a project — so we serve it without reordering the roadmap. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The LLVM backend never produced a runnable executable: () lowered to {} so @main emitted invalid 'ret {} 0', and @main was not a C-runtime int main. Fixed in lib/llvm_codegen.ml — TyTuple [] -> void (like TyCon Unit), and the main entry emits 'define i32 @main()' returning 0.
Verified end-to-end: AffineScript -> LLVM IR -> x86 native binary prints 'Hello, AffineScript!' and exits 0 (just native-run examples/hello.affine). Full suite green; no golden regressions.
Completes the native tier's link+run for RISC-V's collaboration: just compile-riscv (riscv64 rv64gc/lp64d object) + just riscv-run-validate (tools/riscv-run-gate.sh: compile -> llc -> riscv64-linux-gnu-gcc link -> qemu-riscv64 run -> assert stdout). The run gate SKIPS cleanly until 'sudo apt install gcc-riscv64-linux-gnu qemu-user'; codegen + object are already verified.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
AffineScript -> riscv64 (rv64gc/lp64d) -> riscv64-linux-gnu-gcc -no-pie -> qemu-riscv64 prints 'Hello, AffineScript!' and exits 0. just riscv-run-validate now green (4/4). -no-pie is required because the runtime references glibc globals (@stdin) via absolute R_RISCV_HI20 relocations a PIE forbids. The RISC-V collaboration's native request is met end-to-end. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds --target to the compile command (cmdliner): full triples verbatim, plus shorthands x86_64/aarch64/arm64/aarch64-android/android/riscv64/riscv32/ios. Precedence: --target > AFFINESCRIPT_LLVM_TRIPLE env var > x86_64 default (unchanged). Threaded as ?triple through codegen_llvm/generate + resolve_triple in lib/llvm_codegen.ml. Verified end-to-end: 'compile examples/hello.affine --target riscv64' -> llc -> link -> qemu-riscv64 prints 'Hello, AffineScript!' exit 0; aarch64-android + default x86_64 triples correct; native x86 executable fix preserved (ret i32 0 / runs). Full suite green. Integrated from a backgrounded worktree agent (feat/target-flag), but re-applied by hand onto current HEAD rather than merged: the agent's worktree was based on a STALE pre-session commit (58dc2a0) and merging it would have reverted the native executable fix (TyTuple [] -> void, main -> i32). Only the flag additions were lifted; the executable fix is intact. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Maps AffineScript testing+benching onto the estate-canonical taxonomy rather than a generic one. Authorities: standards/testing-and-benchmarking/TESTING-TAXONOMY.adoc (16 categories + 14 aspects + Six-Sigma benches + CRG/TRG/RSR + weakest-repo rule) and proven's reusable model (e2e.sh proof-chain, prop_/e2e_/aspect_ naming, symbol-audit interop guard, cargo-fuzz boundary fuzzing, Criterion benches, two-tier build, honest gap ledgers). Includes: per-category coverage (mapping this session's 8 gates + the alcotest suite), per-aspect coverage, Six-Sigma bench plan, an interoperability ledger (typed-wasm/coprocessor/native-ABI seams + the symbol-audit to adopt), a risk-transfer ledger (the loud-fails/carve-outs as explicit risk shifts), and a reuse-from-proven plan. Honest gap-marking (PBT/FUZ/MUT/CTR/ACC/baselined-benches) per proven's 'no fake placeholders' convention. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
lib/solo_cesk.ml: the semantics VM's first milestone. A CESK abstract machine (explicit Control/Env/Kontinuation) with a DEFUNCTIONALISED continuation (frame list = data, not host closures) — the stepping-stone toward the bytecode VM (one more defunctionalisation away, semantics-preserving). Realises the four ADR-0025 decisions: CESK substrate (Q1), VM-native tagged values with QTT quantity on each binding (Q2), the reduced Solo core mirroring Soundness.idr (Q3), and affine enforcement + tropical cost-metering ON by default with an enforce=false / unchecked escape (Q4). Solo core: STLC + Unit + ⊗-products + ⊕-sums + let, de Bruijn, eager CBV; Fst/Snd drop the unprojected component (the runtime face of affine preservation). The CESK transitions ARE the Solo small-step Step relation (the proof-oracle coupling, from M1). Tests (test/test_solo_cesk.ml, wired into the gate — 477 total, green): CRG-mapped — UT one-per-rule, EXE runs, CTR the affine invariant (over-use rejected ON, allowed under unchecked, ω reuse, under-use ≤1 ok, erased-0 rejected), PER cost accumulation + budget→Infeasible, plus a determinism check. Full qcheck PBT + differential-vs-interp oracle + step-rate bench are the matrix gaps (docs/TESTING-AND-BENCH-MATRIX.adoc). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Per-repo CRG instance (the artifact the standards taxonomy requires). Measured blitz 2026-06-16: 36k LOC, 477 alcotest green, 48 backends; 16-category status with honest GAPs (PBT/FUZ/MUT/REF/baselined-benches); performance numbers (compile 2-3ms/backend, native exec 1ms x86 / 11ms riscv-qemu, proof-check 47s = ~all Agda, Idris 365ms + Lean 185ms trivial); benches present but visibility-only/non-baselined + recipe half-broken. Self-assessed CRG D approaching C. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Three remedies to the TEST-NEEDS.md bench gaps: (1) 'just bench' second command was broken (dune runtest @bench — @bench is an alias-rule, not a runtest target); recipe now 'dune exec bench/bench_main.exe'. (2) bench_main wrapped each bench in Alcotest.run, which CAPTURES per-test stdout — so the numbers were computed but never printed; the runner now calls each run () directly. (3) added bench_scaling (generated N-function programs, parse+resolve+codegen, µs/func) and bench_vm (Solo CESK step-rate). FINDING surfaced immediately: compile-time scaling is SUPER-LINEAR (~O(n^2)) — 4.4 µs/func at n=100 but 80 µs/func at n=5000 (5x input -> ~32x time). Invisible on the 114-line corpus. Filed as issue-draft 07. The VM is clean: exactly 3n+1 steps (linear), ~3.5e7 steps/sec steady. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The scaling bench found compile time is ~O(n^2) past ~1000 functions (4.4 µs/func at n=100 -> 80 µs/func at n=5000). issue-draft 07 records the data + the prime suspects (resolve.ml symbol table / codegen.ml per-function full scan as List.assoc/List.nth) + the next step (phase-split the bench to localise, swap to Hashtbl, target flat µs/func). TEST-NEEDS.md bench section updated: harness fixed + numbers visible + scaling/VM benches added. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…posed) Scopes the formal-complexity track (option C). Machine-check compiler-phase asymptotic bounds in Isabelle/HOL using the AFP running-time tooling (Time_Monad, Amortized_Complexity, Akra-Bazzi), homed in tropical-resource-typing (where Isabelle + the (min,+) cost semiring already live — 101 .thy, incl. Tropical_Kleene/Tropical_v2). Complementary to the empirical bench: proof = asymptotic class, bench = constants; a tied 'resolve is O(n)' proof would have flagged today's O(n^2). F1 = model resolve in the Time_Monad, prove O(n), write a faithfulness obligation vs lib/resolve.ml. Isabelle scoped narrowly (NOT promoted into the Coq/Idris2/Lean/Agda primary matrix). Status: Proposed; 4 sign-off questions. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…=5000) issue-draft 07, localised by the phase-split scaling bench to lib/codegen.ml (parse+resolve are linear). (1) gen_decl appended to funcs/func_indices/ownership_annots with xs @ [x] (O(len) each) -> O(n^2); now cons + List.rev once at emission (all_funcs, build_ownership_section) — indices unchanged (derived from List.length, order-independent). (2) func_idx used List.length ctx.funcs per function -> O(n^2); now an O(1) num_funcs counter field. codegen at n=5000: 453ms -> 70ms (~6.5x); µs/func 90.6 -> 14.1. Verified byte-identical: 477 tests + wasm-validate (21/0/5) green. Residual mild super-linearity remains (a third smaller source — needs a profiler to localise; types-intern/exports/Effect_sites ruled out); ADR-0026 F1 is the durable guard. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…gate Closes part B of the perf programme — the per-backend RUNTIME bench gap (TEST-NEEDS.md), with proven-style real, correctness-checked workloads. bench/workloads/lp_tropical.affine: linear programming over the tropical (min,+) semiring (the same algebra as the VM cost-meter + tropical-resource-typing Isabelle dev) — shortest path via min/+, asserts the optimum 9. bench/workloads/nlp_newton.affine: non-linear Newton/Babylonian integer sqrt, asserts isqrt(144)=12. Pure Int so both run on every backend (no Float-heap). just workload-bench (tools/workload-bench.sh): each workload -> core wasm (emit+validate) AND native (clang -> run -> assert verdict token -> time). Verified 4/4: LP wasm 523B + native LP_TROPICAL_OK ~1046µs; NLP wasm 535B + native NLP_NEWTON_OK ~1220µs. Proves the backends execute recursive arithmetic programs end-to-end with correct results. Float-NLP (gradient descent) + array workloads are follow-ups gated on the durable Float-heap fix (task #6). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The regular TopFn path always-appended a fresh func_type to ctx.types (List.length + @ [..], both O(len) per decl) while only the extern path interned — so ctx.types grew one-per-function → the residual O(n²) (issue-draft 07). Route TopFn through intern_func_type too: equal type → existing index, new type → appended at the same end position, so all prior type indices are preserved; the Wasm type section is now deduped. codegen scaling: n=5000 70ms → 8ms (~8.6x further, ~50x vs original 401ms); curve now flat 0.8→1.6 µs/func across n=100→5000. 477 tests + wasm-validate (21/0/5) green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Threads the typechecker's per-node types into codegen the idiomatic way: synth records heap nodes whose CELL type is Float (an Array literal with a Float element, an index a[i] yielding Float) into float_heap_sites; elaborate_string_concat rewrites those exact nodes into new AST constructors ExprFloatArray / ExprFloatIndex; codegen lays them out with a 4-byte length header + 8-byte f64 cells (f64.load/f64.store, 8-byte stride). Because synth sees every node's checked type, coverage is total — every Float construction and every Float-yielding access is caught regardless of how the array flowed there, so codegen never guesses a cell width (the gap that made a codegen-local fix unsafe and kept the interim loud-fail). Array[Float] construct / read a[i] / write a[i]=e now validate AND round-trip the f64 correctly on wasmtime (FARR_OK / WRITE_OK); nested Array[Array[Float]] works. guard_no_heap_float's Array case lifted. Tuples, records, captured floats, and float compound-assign still loud-fail (no silent corruption) — next increments of the same pattern. Passes: 477 tests, wasm-validate gate (26/0/5, incl. 5 positive Array[Float] checks + 2 wasmtime round-trips). New constructors ripple only to post-elaborate passes (interp re-dispatches, opt folds, effect_sites census); other backends use wildcards. issue-draft 05 updated. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Extends the Float wall to tuples (issue-draft 05), same record->elaborate->lower pattern as arrays. synth records an all-Float ExprTuple / an all-Float tuple's ExprTupleIndex; elaborate rewrites to ExprFloatTuple / ExprFloatTupleIndex; codegen lays them out with 8-byte f64 cells (no length header, field i at i*8). Only ALL-Float tuples are handled — a mixed (Int,Float) tuple has type-dependent field offsets, so it keeps loud-failing (guard's TyTuple case lifts only for all-scalar-Float). Closes reproducer (b): (Float,Float) construct + t.i read validate and round-trip on wasmtime (FTUP_OK). Composition is free: Array[(Float,Float)] (i32 pointers to f64-tuples) construct + a[i].0 read round-trips (AFT_OK), and nested Array[Array[Float]] validates. All three original issue-draft 05 reproducers (a,b,c) now closed. Passes: 477 tests, wasm-validate gate 30/0/5 (9 positive Float-heap checks incl. 4 wasmtime round-trips), typed-wasm gate 2/0. Records, mixed tuples, and captured-float closures still loud-fail (no silent corruption). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… cells) Generalises the all-Float tuple (ExprFloatTuple) into ExprCellTuple/ ExprCellTupleIndex carrying per-cell kinds. A tuple with any scalar Float field now uses a UNIFORM 8-byte cell layout: field i at offset i*8 regardless of the field-type mix, with the load/store opcode chosen per cell (f64 for Float, i32 otherwise — i32 writes the low 4 bytes of the 8-byte slot). Uniform-8 sidesteps type-dependent offset accumulation, so (Int,Float) and (Float,Int) both work. synth records per-field cell kinds (cell_tuple_sites) and, for EVERY access to a float-bearing tuple (not just float fields, since the whole tuple is uniform-8), the accessed field's kind (cell_tuple_index_sites); elaborate rewrites via List.assq. Guard's TyTuple case fully lifted (tuples handled; unhandled inner aggregates loud-fail at their own site). Verified on wasmtime: (Int,Float)->MIX_OK, (Float,Int)->FI_OK, plus the prior all-Float/array/nested round-trips. 477 tests, wasm gate 32/0/5. Records and captured-float closures still loud-fail (next). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…s designed Records design recorded: closed float-bearing records get sorted-by-name uniform-8 layout (construct/access derive identical offsets from names); open/polymorphic rows keep loud-failing; value round-trips mandatory. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…name 8-byte)
Extends the Float wall to closed float-bearing records. To guarantee
construct/access agree on field->offset without depending on literal-vs-type
order (a wrong offset is silent corruption, not an invalid module), fields are
placed by NAME sorted ascending in uniform 8-byte cells. synth records the
per-field kinds for an ExprRecord (closed, no spread) and, for every field
access on a closed float-bearing record, bakes (byte_offset, is_f64) computed
from the closed row; elaborate rewrites to ExprCellRecord / ExprCellField; codegen
sorts by name to place cells. Open/polymorphic record rows yield no layout and
keep loud-failing (guard's TyRecord case lifts only when the row is closed).
Verified on wasmtime incl. the critical literal-order != name-order case
(#{b:2.0,a:1.0} -> REC_ORDER_OK) and mixed Float/Int records (REC_MIX_OK).
477 tests, wasm gate 35/0/5 (15 positive Float-heap checks, 8 round-trips).
Remaining (task #8): captured Float in closures (env cells are i*4, float-
specific UnboundVariable today — loud, not silent) and float array compound-
assign (clean loud-fail). issue-draft 05 to follow.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…closures Two closure fixes (issue-draft 05): 1. find_free_vars (codegen, runs on the post-elaborate tree) had a |_->[] catch-all that skipped ExprFloatBinary and the new cell nodes, so a variable captured ONLY inside a float expression was missed -> the closure mis-lowered to UnboundVariable. Now traverses all float/cell nodes like their pre- elaboration forms. 2. f64 in a closure (captured Float, Float param, or Float result) now loud-fails cleanly with UnsupportedFeature instead of UnboundVariable/invalid wasm. Full support needs an f64-aware closure calling convention (env cells, lambda param/result/local types, CallIndirect type) — a larger ABI change than the aggregate-cell work, tracked as task #8. Int closures unaffected (validate). 477 tests, wasm gate 36/0/5 (15 positive Float-heap round-trips + 2 clean loud-fail rejects). Every Float transiting a heap AGGREGATE (array, tuple all/mixed, closed record) now lowers correctly; residual rejects are the closure ABI + compound-assign + open records. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… 08)
A borrow bound through an if/match/block value (e.g. `let r = if c { pick(a) }
else { pick(a) }`) silently lost its origin, so a later move of `a` while `r`
was live was accepted — a use-after-move false negative, same loan-propagation
class as #554 but on the caller side. Found by adversarial probing; closeable
without full Polonius.
Root cause was in the CHECKING pass, not the return-borrow summary:
- check_block restored state.borrows to block-entry and cleared result_borrows
at exit, so a block/branch whose VALUE is a returned borrow of an OUTER place
swallowed that borrow.
- ExprIf/ExprMatch joins intersected branch borrows by b_id; each branch mints a
distinct record for the same origin, so the intersection dropped both.
- record_ref_binding (and the StmtAssign reassign path) only claimed
result_borrows for an ExprApp value.
Fix:
- New helper value_escaping dispatches on value SHAPE (call/if/match/block ->
result_borrows channel; &p/ref-var -> ref_source_borrow; else none), so a
stale channel from an earlier sibling statement is never mis-attributed.
- check_block computes the tail's escaping borrows BEFORE the lexical restore
(filtering out block-local-owned ones — those die / are caught by
BorrowOutlivesOwner) and re-publishes them past the restore.
- ExprIf/ExprMatch re-publish the UNION of branch/arm escaping borrows (the
value is one tail or another; union is the sound merge, never drops an
origin).
- record_ref_binding + StmtAssign reassign claim result_borrows for
ExprApp|ExprIf|ExprMatch|ExprBlock.
Verified: 8 unsound reproducers + a 2nd adversarial round (if-of-match,
match-of-block, try-bound, tuple-smuggled, ref-var-forwarding block, deep mix)
all reject (MoveWhileBorrowed); anti-over-rejection (NLL use-before-move,
unrelated move, value-blocks) passes. 6 hardening fixtures added
(test/e2e/fixtures/borrow_cond_origin_*). Full suite green: 483 tests (+6).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Format-only conversion (pandoc, headings demoted, MPL-2.0 SPDX + owner attribution header). Dollar signs are all shell snippets in code fences (no LaTeX); code blocks preserved verbatim. No internal doc links. 14 cold session-log/history files. Completes the docs .md→.adoc sweep. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Format-only conversion (pandoc, heading demoted, MPL-2.0 SPDX + owner attribution header). Repoints ../ALIB-INTEGRATION link to .adoc. The ../specs/affinescript-spec.md reference is a pre-existing dangle (target never created) — left verbatim. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…tion (batch B link pass)
Repoints internal nav links to the now-.adoc standards docs
(ROADMAP/DECISIONS/RELEASE) in NAVIGATION, README, PANIC-ATTACK, TESTING.
These four pre-existing .adoc files predated the strict pre-commit owner-
string check and carried attribution drift ('Hyperpolymath Contributors'
/ bare 'hyperpolymath'); staging them to fix the links tripped the hook.
Per owner authorization (extends the batch-B SPDX/attribution exception
to these touched files), reconciled each to the canonical
'hyperpolymath (Jonathan D.A. Jewell <j.d.a.jewell@open.ac.uk>)' form,
preserving each file's existing copyright year. Hook now passes without
modification — its drift-detection worked as intended.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Ground-truthed record of task #12: batches done, files held back and why, the hook-caught attribution-drift reconciliation, open owner-gated flags (uninstantiated SECURITY.md/ABI-FFI.md templates, LICENSING-GUIDE.md, dangling SECURITY.md:388 README link, stale AFFIRMATION.adoc:117 note), and the pre-existing dangling doc links left visible. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…arms
The C backend's ExprHandle arm compiled the body and discarded every
handler arm (`handle 41 { return(v) => v+1 }` emitted 41, not 42), and
ExprResume was a silent argument passthrough — the same silent wrong-value
miscompile #555 fixed in the WASM/WasmGC/Deno-ESM/JS-text backends. Now
failwith (caught by codegen_c → Error), matching the loud-fail policy; use
the interpreter for algebraic effects.
Adds an E2E loud-fail test for the C backend. Documents (in the handler-
fence test block) that the Lean/Why3 text backends are experimental stubs
which drop `return` statements wholesale — a broader incompleteness than
#555, flagged separately rather than given a misleading handle-only fence.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Trait.check_coherence was a no-op stub ("TODO: Check for overlapping
impls") and was never called, so two impls of the same trait for the same
(or unifiable) self type were silently accepted and method resolution
picked whichever impl came first — an unsoundness.
- Implement check_coherence: two impls overlap iff their self types unify
after each is instantiated with fresh vars for its own type params (reuses
fresh_impl_self_ty + Unify.unify; the existing OverlappingImpl error).
- Add check_all_coherence over every registered trait.
- Wire it into Typecheck.check_program after the forward pass registers all
impls; surface via a new, clearly-named TraitCoherenceError type_error.
- Tests (E2E Trait Coherence #559): duplicate concrete impl → rejected;
impls for distinct types → accepted; distinct generic args
(Pair[Int,Bool] vs Pair[Bool,Int]) → accepted (no over-reject).
Known follow-up (documented in-test): generic-subsumption overlap
(impl[T] for Box[T] vs impl for Box[Int]) is not yet detected — it rides on
generic-impl handling that has separate pre-existing immaturities; the
soundness hole #559 named (silent acceptance of overlapping concrete impls)
is closed. 525 tests green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…actor
Closes the borrow-vs-borrow divergence class (2 of the 9 allowlisted
extractor↔lexical disagreements). The extractor now emits a conflict for a
direct read of a place while a live `&mut` (exclusive) loan covers it:
* `let b = &mut x` reads x at its creation point while the first `&mut x`
is still live → borrow_mutref_conflict now AGREES;
* `let y = x` (plain read) while `&mut x` is live → borrow_mutref_use_while
now AGREES.
Implementation: add rl_excl to the loan record (threaded from rhs_borrows);
a post-pass scans use_at × loans and, for each read of var v while an
EXCLUSIVE loan rooted at v is live, emits conflict_at — excluding the loan's
own birth point (that read IS the creation) and reads through the borrow
(`*a` has root a, not x, via PlaceDeref). Emission is liveness-gated by the
solver, so NLL last-use shortening keeps valid code accepted; shared `&`
loans never trigger it.
Tests: t_extract_mutref_conflict / _use_while (flagged), t_extract_shared_
read_ok (new fixture — read-while-shared-borrowed accepted, anti-over-reject).
Allowlist 9→7; corpus parallel-run gate green (no new divergence). 528 tests.
Remaining M3 divergences: return-escape/outlives-owner (4, needs escape
analysis), captured-linear (2), call-aliasing mut-param-arg (1).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…553/#555/#556/#558/#559) The canonical .claude/CLAUDE.md soundness survey is stale. Captured the corrected, test-verified status (528 tests green) in the committed tidy-up ledger instead of editing CLAUDE.md, which lacks an SPDX/owner header and is thus owner-only to commit. Owner to fold into CLAUDE.md manually. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Self-contained, purpose-curated handoff: mission + working rules, repo identity (NOT ephapax), env/commands (OCaml 4.14, no native effects), the pre-commit hook rules, file map, the ground-truthed soundness state (with an explicit 'CLAUDE.md survey is STALE' warning), this session's commits, and prioritized open work with exact next steps (#555 CPS residual [needs owner steer], M3 remaining divergences, #559 generic-subsumption, owner-manual items). The next Claude can pick up cold by reading one file. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ctor Close the call-aliasing divergence (borrow_use_while_excl): passing `x` to a `mut` parameter and reading `x` again in the SAME call — `mut_then_read(x, x)` — was flagged by the lexical checker (UseWhileExclusivelyBorrowed) but not by the Polonius extractor, so the fixture sat on the parallel-run allowlist. The extractor now mirrors the lexical checker's left-to-right argument fold: a `mut`-param argument that denotes a place mints a CALL-SCOPED exclusive loan (born at the enclosing point, killed at point+1 so it never reaches the next statement), and any LATER argument of the SAME call whose place overlaps an earlier `mut`-arg's loan emits a conflict. Order- and call-scoped, exactly matching the fold: it cannot fire across statements, so `just_mut(x); read_int(x)` stays accepted, and it over-approximates only toward MORE real same-call conflicts the lexical checker also reports — no false positive. The loans are emitted as raw borrow_at/killed/conflict_at facts only (not pushed onto the loan list), so the generic use-while rule and the move passes are untouched; the solver invalidates them via liveness. Allowlist 7 -> 6 (borrow_use_while_excl pruned). Corpus parallel-run gate stays green (117 fixtures, zero unexpected divergence). Adds two hardening tests: the call-aliasing flag, and the call-scoped anti-over-rejection case. 528 -> 530 tests green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
## What Merges current `main` into `feat/solo-core-metatheory-proofs` and resolves all conflicts, so that **#614 stops being `mergeable_state: dirty`**. While #614 is dirty, GitHub cannot build the merge commit, so its entire CI suite is suppressed (only `lint-workflows` runs). Merging this PR into the feature branch un-blocks #614's CI. ## Why it was needed #614 branched off an old `main` (merge-base 61 commits back) and conflicts with the standalone-CI + codegen work that has since landed (#602, #603, #604, #606, #609, #610, #611, #612, #613). ## Conflict resolutions (5 files) | File(s) | Resolution | |---|---| | `governance.yml`, `scorecard.yml`, `scorecard-enforcer.yml`, `hypatia-scan.yml` | Take `main`'s **standalone** versions. The branch re-adopted the estate `standards` reusables that `main` deliberately dropped (#603/#604) to stop run-creation `startup_failure`s. `main`'s `hypatia-scan.yml` also restores the permissions Hypatia needs (`security-events: write`, `pull-requests: write`, `secrets: inherit`) and the `MPL-2.0` SPDX id the Palimpsest license doc mandates for tooling. | | `docs/PROOF-NEEDS.md` | Drop the branch's stale 103-line `.md`; keep `main`'s canonical 359-line `.adoc` (#609). Also satisfies DOC-FORMAT. | | `docs/history/MODULE-SYSTEM-PROGRESS` | Keep the branch's `.md`→`.adoc` migration; **port** `main`'s additive #138 codegen-follow-up note + status-table row into the `.adoc` so `main`'s work is preserved. | `spark-theatre-gate.yml` and `mirror.yml` were identical to `main`. ## Verification (merged tree) - `dune build` — clean - `dune test` — **534/534 pass** (incl. `cross-module constructor linking, Wasm (#138)`, `Wasm nested tuple patterns`, `Deno-ESM / JS no duplicate Option/Result constructor`) - wasm-runtime harness (`tools/run_codegen_wasm_tests.sh`) — all pass - workflow scan — no `startup_failure` risk introduced ## How to use Merge this into `feat/solo-core-metatheory-proofs`. #614 then becomes mergeable and its full CI runs. > Routed via this branch because the environment only permits pushes to `claude/inspiring-newton-dg5wov`, not directly to the feature branch. Un-blocks #614. 🤖 Generated with [Claude Code](https://claude.com/claude-code) https://claude.ai/code/session_01Lz7pRcec2Z3tVtaAhvB3M8 --- _Generated by [Claude Code](https://claude.ai/code/session_01Lz7pRcec2Z3tVtaAhvB3M8)_ --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: hyperpolymath <paraordinate@yahoo.co.uk>
…ict (#616) ## What Clears the **last** merge conflict blocking #614. After #615 was **squash-merged**, the feature branch gained `main`'s content but not its ancestry, so #614's 3-way merge re-derived one conflict: `docs/history/MODULE-SYSTEM-PROGRESS` (modified on `main` in #602 / migrated to `.adoc` + deleted on this branch). ## Fix Match `main` exactly for that one file — restore `main`'s `.md`, drop the `.adoc`. **No content is lost**: `main`'s `.md` already carries the #138 codegen-follow-up note. The `.md`→`.adoc` migration can be redone later as a standalone DOC-FORMAT change. This resolution is **squash-robust** — it makes the file modify/modify-identical rather than modify/delete, so #614 stays conflict-free regardless of how this PR is merged. ## Verified - Trial merge of the fixed feature branch into `main`: **0 conflicts** ("Automatic merge went well"). - `dune build` clean (docs-only delta from the already-green #615 tree — `dune test` 534/534 there). Merging this into `feat/solo-core-metatheory-proofs` makes **#614 mergeable** and triggers its full CI. 🤖 Generated with [Claude Code](https://claude.com/claude-code) https://claude.ai/code/session_01Lz7pRcec2Z3tVtaAhvB3M8 --- _Generated by [Claude Code](https://claude.ai/code/session_01Lz7pRcec2Z3tVtaAhvB3M8)_ Co-authored-by: Claude <noreply@anthropic.com>
🔍 Hypatia Security ScanFindings: 42 issues detected
View findings[
{
"reason": "Action denoland/setup-deno@v2 needs attention",
"type": "unpinned_action",
"file": "publish-jsr.yml",
"action": "pin_sha",
"rule_module": "workflow_audit",
"severity": "medium"
},
{
"reason": "Issue in scorecard-enforcer.yml",
"type": "scorecard_publish_with_run_step",
"file": "scorecard-enforcer.yml",
"action": "split_scorecard_publish_job",
"rule_module": "workflow_audit",
"severity": "high"
},
{
"reason": "Issue in instant-sync.yml",
"type": "secret_action_without_presence_gate",
"file": "instant-sync.yml",
"action": "peter-evans/repository-dispatch",
"rule_module": "workflow_audit",
"severity": "high"
},
{
"reason": "Shell execution -- validate input before passing to shell (1 occurrences, CWE-78)",
"type": "js_exec_sync",
"file": "/home/runner/work/affinescript/affinescript/packages/affinescript-cli/mod.js",
"action": "flag",
"rule_module": "code_safety",
"severity": "high"
},
{
"reason": "Shell execution -- validate input before passing to shell (2 occurrences, CWE-78)",
"type": "js_exec_sync",
"file": "/home/runner/work/affinescript/affinescript/packages/affine-vscode/mod.js",
"action": "flag",
"rule_module": "code_safety",
"severity": "high"
},
{
"reason": "Shell execution -- validate input before passing to shell (1 occurrences, CWE-78)",
"type": "js_exec_sync",
"file": "/home/runner/work/affinescript/affinescript/affinescript-vite/src/affine-plugin-improved.js",
"action": "flag",
"rule_module": "code_safety",
"severity": "high"
},
{
"reason": "expect() in hot path (32 occurrences, CWE-754)",
"type": "expect_in_hot_path",
"file": "/home/runner/work/affinescript/affinescript/affinescriptiser/src/codegen/wasm_gen.rs",
"action": "flag",
"rule_module": "code_safety",
"severity": "medium"
},
{
"reason": "expect() in hot path (29 occurrences, CWE-754)",
"type": "expect_in_hot_path",
"file": "/home/runner/work/affinescript/affinescript/affinescriptiser/src/codegen/affine_gen.rs",
"action": "flag",
"rule_module": "code_safety",
"severity": "medium"
},
{
"reason": "unsafe block -- requires SAFETY comment (2 occurrences, CWE-676)",
"type": "unsafe_block",
"file": "/home/runner/work/affinescript/affinescript/runtime/src/panic.rs",
"action": "flag",
"rule_module": "code_safety",
"severity": "medium"
},
{
"reason": "unsafe block -- requires SAFETY comment (1 occurrences, CWE-676)",
"type": "unsafe_block",
"file": "/home/runner/work/affinescript/affinescript/runtime/src/alloc.rs",
"action": "flag",
"rule_module": "code_safety",
"severity": "medium"
}
]Powered by Hypatia Neurosymbolic CI/CD Intelligence |
hyperpolymath
added a commit
that referenced
this pull request
Jun 21, 2026
…617) ## What Repairs `main`'s **red `CI`** after #614 merged. The `build` job's `./tools/check-doc-truthing.sh` governance guard failed because **#614 migrated six status/history docs from `.md` → `.adoc`**, but the guard still keyed off the old `.md` paths in two places: - **Presence list** (`BANNERED_DOCS`) → `bannered doc is missing` for `BACKEND-IMPLEMENTATION`, `COMPILER-CAPABILITIES`, `ALPHA-1-RELEASE-NOTES`. - **Over-claim ratchet baseline** (`tools/doc-overclaims.allow`) → `new over-claim(s)` for the migrated `.adoc` history/roadmap snapshots (identical, already-blessed claims — just at `.adoc` paths with `.adoc` markup). On the previous `main` tip (`2aa00ff`) this guard was green, so this is a regression introduced purely by the migration — not new over-claiming. ## Fix (hotfix + hardening) 1. **Re-key the over-claim baseline** to the migrated `.adoc` paths via the tool's documented `--update`. This is a **1:1 re-bless of identical historical/roadmap content** — every removed `.md` signature has an equivalent `.adoc` one added; **zero new live over-claims**. 2. **Harden the presence check** so it can't re-break on a future migration: `BANNERED_DOCS` are now extension-less **stems** resolved to whichever of `.adoc`/`.md` is present (`.adoc` preferred). ## Verification - `./tools/check-doc-truthing.sh` → exit 0: *"OK: doc-truthing intact — presence invariants + over-claim ratchet (DOC-04/05/08/09)."* - Resolver unit-tested: both-present → `.adoc`; only `.md` → `.md`; neither → missing. - `dune build` clean. (The only `build`-job step after the guard is `dune build @fmt`; this change touches no OCaml, so it's unaffected.) Greens the `build` job / `CI` on `main`. 🤖 Generated with [Claude Code](https://claude.com/claude-code) https://claude.ai/code/session_01Lz7pRcec2Z3tVtaAhvB3M8 --- _Generated by [Claude Code](https://claude.ai/code/session_01Lz7pRcec2Z3tVtaAhvB3M8)_ Co-authored-by: Claude <noreply@anthropic.com>
hyperpolymath
added a commit
that referenced
this pull request
Jun 21, 2026
#614 added lib/borrow_polonius/dune unformatted; the build job's `dune build @fmt` requires a blank line between the comment block and the (library ...) stanza. It was masked on main until #617 fixed the doc-truthing guard that failed earlier in the same job. Only the blank line is added (canonical `dune format-dune-file` output); no other dune files are touched — root dune-project / .build/dune-project "drift" only under newer local dune and is not flagged by CI's dune. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Lz7pRcec2Z3tVtaAhvB3M8
hyperpolymath
added a commit
that referenced
this pull request
Jun 21, 2026
## What Greens `main`'s `build` job. After #617 fixed the doc-truthing guard, the next `build` step — `dune build @fmt` — fails because **#614 added `lib/borrow_polonius/dune` without the blank line** `dune fmt` requires between the comment block and the `(library …)` stanza. ## Fix Add the single blank line (canonical `dune format-dune-file` output). Nothing else changes. > Note: root `dune-project` / `.build/dune-project` show "drift" only under newer **local** dune (3.14.0) and drift identically on `2aa00ff` where CI's `@fmt` was green — i.e. CI's dune does **not** flag them, so they're deliberately left untouched. ## Why this is a separate PR #614 was merged (bypass) while its CI was incomplete — the `build` job died at doc-truthing *before* reaching `@fmt`, so its `@fmt` compliance was never checked. #617 fixed the first failure; this fixes the one it unmasked. After this, the `build` job should be green. Verified: `dune format-dune-file lib/borrow_polonius/dune` is now a no-op. 🤖 Generated with [Claude Code](https://claude.com/claude-code) https://claude.ai/code/session_01Lz7pRcec2Z3tVtaAhvB3M8 --- _Generated by [Claude Code](https://claude.ai/code/session_01Lz7pRcec2Z3tVtaAhvB3M8)_ Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.